Option panel : Links
- Attempt to detect all links
Asks the engine to try to detect every link in a page, even in unknown tags or unrecognized JavaScript code. This can generate bad requests or errors in pages, but may help to catch all desired links.
Useful, for example, on pages that rely on many JavaScript tricks.
- Get non-html files related to a link
This option lets you capture all files referenced in captured HTML files, even external ones.
For example, if an image in an HTML page has its source on another web site, that image will be captured as well.
- Test validity of all links
This option forces the engine to test all links in spidered pages, i.e. to check whether each link is valid by sending a request to the server. If an error occurs, it is reported in the error log file.
Useful for testing all external links in a website.
- Get HTML files first!
With this option enabled, the engine will attempt to download all HTML files first, and
then download the other files (such as images). This can speed up the parsing process by scanning
the HTML structure efficiently.
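
For scripted use, the four options above can be approximated on the httrack command line. The sketch below is an illustration only: it assumes the command-line options listed in the httrack manual page (--extended-parsing, --near, --test, --priority=7 and -O for the output path) correspond to these checkboxes, and the helper names links_panel_args and build_command are hypothetical, not part of HTTrack.

```python
# Hypothetical helper (not part of HTTrack): maps the four "Links" panel
# checkboxes to the command-line options assumed to be their equivalents.
def links_panel_args(detect_all_links=False, get_near_files=False,
                     test_all_links=False, html_first=False):
    args = []
    if detect_all_links:
        args.append("--extended-parsing")  # short form: -%P
    if get_near_files:
        args.append("--near")              # short form: -n
    if test_all_links:
        args.append("--test")              # short form: -t
    if html_first:
        args.append("--priority=7")        # short form: -p7
    return args


def build_command(url, output_dir, **panel_options):
    # -O sets the path where the mirror is written.
    return ["httrack", url, "-O", output_dir] + links_panel_args(**panel_options)


if __name__ == "__main__":
    # Only builds and prints the argument list; nothing is downloaded here.
    print(build_command("https://example.com/", "./mirror",
                        get_near_files=True, html_first=True))
```

The resulting list can be passed to subprocess.run to start the mirror. Building the arguments as a list rather than a single string avoids shell quoting problems with URLs.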